Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pseudoinstructions #519

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ThinkOpenly
Copy link
Contributor

[Draft]

An implementation is proposed for representing pseudoinstructions in Sail.

The realization is similar to instructions, with a few differences:

  1. union clause ast: Same.

  2. mapping clause assembly: Same.

  3. function clause execute:

    This calls the execute clauses for the respective base instructions as defined for each pseudoinstruction.

    Note: There are two pseudoinstruction examples in this PR: LA and VNEG. LA maps to two base instructions, and VNEG maps to one. The implementation for LA compiles, but the implementation for VNEG fails:

    
    Error:
    model/riscv_insts_end.sail:24.16-23:
    24 |function clause execute C_ILLEGAL(s) = { handle_illegal(); RETIRE_FAIL }
       |                ^-----^
       | Function execute calls function marked non-executable
    
    

    ... a message which leaves more questions than answers for me.

  4. mapping clause encdec: This is undefined for pseudoinstructions because, for one thing, some pseudoinstructions map to more than one instruction. This is replaced by a new scattered mapping: function clause pseudo_of: This maps the pseudoinstruction AST to a list of strings of the mapped instructions, for example:

    
    function clause pseudo_of(LA(rd, imm)) = [|
      "auipc" ^ spc() ^ reg_name(rd) ^ sep() ^ hex_bits_signed_20(imm[31..12]),
      "addi" ^ spc() ^ reg_name(rd) ^ sep()  ^"x0" ^ sep() ^ hex_bits_signed_12(imm[11..0])
    |]
    
    

    This is roughly analogous to function clause assembly, but instead of representing the instruction syntax, it represents the syntax of the mapped instructions.

@Alasdair
Copy link
Collaborator

Alasdair commented Aug 7, 2024

... a message which leaves more questions than answers for me.

Right now we don't allow the string parsing part of the assembly mapping to be used in the actual executable instruction semantics, i.e. the behaviour of the instruction cannot depend on parsing the assembly syntax. Any definition which uses the string parsing is marked as non-executable, so it cannot appear in the semantics as defined by the execute function. I think the error is probably pointing at the wrong clause.

|]

function clause execute VNEG(vs, vd) = {
/* execute(VXTYPE(VX_VRSUB, maybe_vmask(""), vs, reg_name("x0"), vd)) */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

execute(VXTYPE(VX_VRSUB, 0b1, vs, 0b00000, vd))

I think should fix the issue you are seeing

Copy link

github-actions bot commented Aug 7, 2024

Test Results

712 tests  ±0   712 ✅ ±0   0s ⏱️ ±0s
  6 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit bd4ef81. ± Comparison against base commit 05b845c.

♻️ This comment has been updated with latest results.

@ThinkOpenly ThinkOpenly force-pushed the upstream-pseudoinstructions branch from 36e4854 to c032279 Compare August 9, 2024 02:44
@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 9, 2024

Pseudoinstructions are for assemblers. They are emphatically not part of the ISA and do not belong in an executable model thereof.


function clause pseudo_of(LA(rd, imm)) = [|
assembly(UTYPE(imm[31..12],rd,RISCV_AUIPC)),
assembly(ITYPE(imm[11..0],reg_name("x0"),rd,RISCV_ADDI))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is absolutely the wrong place to do this. Moreover, what certain pseudoinstructions expand to depend on the exact toolchain flags provided. For example, LA is either AUIPC+ADDI or AUIPC+L[WD] (W vs D based on XLEN) depending on whether using -fPIC. Even TAIL varies based on Zicfilp.

Copy link
Collaborator

@jrtc27 jrtc27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments.

@Timmmm
Copy link
Collaborator

Timmmm commented Aug 9, 2024

Maybe you could provide some details about the motivation for this, because at first glance I'd agree with Jessica. Doesn't seem like it belongs here.

@billmcspadden-riscv billmcspadden-riscv added the tgmm-agenda Tagged for the next Golden Model meeting agenda. label Aug 12, 2024
@Alasdair
Copy link
Collaborator

A different design might be have to a separate pseudo type, so you would have:

union clause pseudo =  LA : (regidx, bits(32))

val pseudo_to_instruction : pseudo -> ast

val pseudo_of : ...

this means the pseudo instructions are not mixed in with the regular executable instruction type, and can essentially live 'on the side'.

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 12, 2024

But they still don't belong anywhere in Sail. They're not part of the architecture.

@ThinkOpenly ThinkOpenly force-pushed the upstream-pseudoinstructions branch from c032279 to 55bd267 Compare August 21, 2024 00:51
[Draft]

An implementation is proposed for representing pseudoinstructions in Sail.

The realization is similar to instructions, with a few differences:

1. `union clause ast` -> `union clause pseudo`.

1. `mapping clause assembly` -> `mapping clause pseudo_assembly`.

1. `function clause execute` -> `function clause pseudo_execute`:

   This calls the `execute` clauses for the respective base instructions as
   defined for each pseudoinstruction.

1. `mapping clause encdec`: This is undefined for pseudoinstructions because,
   for one thing, some pseudoinstructions map to more than one instruction.
   This is replaced by a new scattered mapping:
   `function clause pseudo_of`: This maps the pseudoinstruction AST to a list
   of the assembly clauses (strings) of the mapped instructions, for example:

   ```
   function clause pseudo_of(LA(rd, imm)) = [|
     assembly(UTYPE(imm[31..12],rd,RISCV_AUIPC)),
     assembly(ITYPE(imm[11..0],reg_name("x0"),rd,RISCV_ADDI))
   |]

   ```

   This is roughly analogous to `function clause assembly`, but instead
   of representing the instruction syntax, it represents the syntax of the
   mapped instructions.
@ThinkOpenly ThinkOpenly force-pushed the upstream-pseudoinstructions branch from 55bd267 to bd4ef81 Compare August 21, 2024 01:15
@ThinkOpenly
Copy link
Contributor Author

[...] the pseudo instructions are not mixed in with the regular executable instruction type, and can essentially live 'on the side'.

I've rewritten to move the pseudostructions "on the side", with the suggested data structures. I've also moved everything to it's own file, model/riscv_pseudos.sail.

@Timmmm
Copy link
Collaborator

Timmmm commented Aug 21, 2024

Definitely an improvement but I'd still like to hear the motivation for this and how you plan to handle compiler options like -fPIC, -mcmodel, etc.

Is the idea to create a proper standard for how pseudo instructions behave (afaik there isn't one currently)?

@ThinkOpenly
Copy link
Contributor Author

Definitely an improvement but I'd still like to hear the motivation for this

The motivation is to enhance the Sail representation of RISC-V so it can serve as the root for downstream projects, including rudimentary disassemblers (like Capstone), the architectural-specific components of robust assemblers and disassemblers (binutils, LLVM), and much more. As it stands most of those implementations are done manually -- taking significant time and effort, and potentially introducing errors. This could be partially or fully automated. Given the steady stream of extensions being added to RISC-V, automation of implementation would seem like a useful gain in efficiency for the ecosystem.

and how you plan to handle compiler options like -fPIC, -mcmodel, etc.

It's useful to robustly define the pseudoinstructions, since they are an almost essential part of RISC-V programming/debugging. There was once a list in the ISA document. That's been removed. They're now in an obscure corner of the RISC-V GitHub presence in the "RISC-V Assembly Programmer's Manual".

Choosing Load Address ("la") was maybe not the best choice as an example, given its oddities. The definitions should reflect what the pseudoinstructions do. Since "la" has two implementations, depending on PIC, there should probably be two realizations, say "LA" and "LA_PIC", but I'm open to ideas.

Is the idea to create a proper standard for how pseudo instructions behave (afaik there isn't one currently)?

I think programmers need a robust assembly language reference. There exists no such thing at present. I'm trying to build one based on the RISC-V Sail specification, but it doesn't have all the required pieces. I argue that it should. The (obviously under construction) site is https://thinkopenly.github.io/RISC-V_ISA/.

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2024

They're now in an obscure corner of the RISC-V GitHub presence in the "RISC-V Assembly Programmer's Manual".

It's not obscure, it's easy to find. Would you call the psABI obscure too, which also used to be in the ISA manual?

Choosing Load Address ("la") was maybe not the best choice as an example, given its oddities. The definitions should reflect what the pseudoinstructions do. Since "la" has two implementations, depending on PIC, there should probably be two realizations, say "LA" and "LA_PIC", but I'm open to ideas.

Well the way it's defined is that LGA and LLA are the two pseudo-instructions, and then LA picks one of the two based on flags (pseudo pseudo?)

I think programmers need a robust assembly language reference. There exists no such thing at present. I'm trying to build one based on the RISC-V Sail specification, but it doesn't have all the required pieces. I argue that it should. The (obviously under construction) site is https://thinkopenly.github.io/RISC-V_ISA/.

What about the syntax for referencing symbols? All the various directives? These things are a core part of being able to write proper assembly sources that aren't just raw instructions, and are generally OS-specific (typically Linux/BSD vs Darwin vs Windows). Even for AArch64 they're OS-specific. That's never going to be in the Sail model. So my view is that the line should be drawn at architecture, rather than some fuzzy line that includes pseudo-instructions.

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2024

I think programmers need a robust assembly language reference. There exists no such thing at present.

Also, on that point, if you think the current assembly manual is insufficient, please contribute your own improvements to it.

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2024

The (obviously under construction) site is https://thinkopenly.github.io/RISC-V_ISA/.

This does look quite cool and useful. But it would be even better if it had all the spec's prose under each instruction, not just the Sail. At which point, what you really want is a machine-readable version of the manual, and then it really does not matter whether the name is in the Sail model, you can just as easily get it from the manual. So making the manual be machine-readable, able to generate the classic PDF and such interactive webpages, seems like the place to invest effort. Then you can just extract the Sail out of the model to insert inline.

@ThinkOpenly
Copy link
Contributor Author

They're now in an obscure corner of the RISC-V GitHub presence in the "RISC-V Assembly Programmer's Manual".

It's not obscure, it's easy to find.

As recently as late June, the "Documentation Build & Release Engineer" for RVI was unaware of the existence of this manual (as was I): riscv/riscv-isa-manual#1470 (comment)

...I suspect we are not alone.

Would you call the psABI obscure too, which also used to be in the ISA manual?

Yes, I would.

What about the syntax for referencing symbols? All the various directives? These things are a core part of being able to write proper assembly sources that aren't just raw instructions, and are generally OS-specific (typically Linux/BSD vs Darwin vs Windows). Even for AArch64 they're OS-specific. That's never going to be in the Sail model. So my view is that the line should be drawn at architecture, rather than some fuzzy line that includes pseudo-instructions.

Yep, "la" is challenging, but I don't think that justifies giving up entirely.

if you think the current assembly manual is insufficient, please contribute your own improvements to it.

For one big thing, it's missing all of the base instructions.

@ThinkOpenly
Copy link
Contributor Author

The (obviously under construction) site is https://thinkopenly.github.io/RISC-V_ISA/.

This does look quite cool and useful.

Thank you.

But it would be even better if it had all the spec's prose under each instruction, not just the Sail.

I'd love to add the spec's prose under each description, and we have done a few stabs at that in my fork. See ThinkOpenly@da29d41 as an example. But, given the reception to just adding instruction names, I'm not sure that would be well-received.

At which point, what you really want is a machine-readable version of the manual, and then it really does not matter whether the name is in the Sail model, you can just as easily get it from the model.

I don't understand that. Is that last word supposed to be "manual"?

So making the manual be machine-readable, able to generate the classic PDF and such interactive webpages, seems like the place to invest effort. Then you can just extract the Sail out of the model to insert inline.

This is the chicken-egg paradox. Should the Sail model be left pristine and the manual be enhanced to be machine-readable, incorporating bits from the model, or should there be a single source for model and manual? I'd heard that Sail was intended to be the single source of truth, so I've focused my efforts at pushing into Sail the things that are missing from the monolithic model/manual approach. Note, that it's not exactly a one-and-done effort. There are phases and iterations and a lot of effort and time. Baby steps. If there's a grand strategy of which I am not aware, I'd like to understand better.

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2024

At which point, what you really want is a machine-readable version of the manual, and then it really does not matter whether the name is in the Sail model, you can just as easily get it from the model.

I don't understand that. Is that last word supposed to be "manual"?

Uh yes. Fixed.

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2024

They're now in an obscure corner of the RISC-V GitHub presence in the "RISC-V Assembly Programmer's Manual".

It's not obscure, it's easy to find.

As recently as late June, the "Documentation Build & Release Engineer" for RVI was unaware of the existence of this manual (as was I): riscv/riscv-isa-manual#1470 (comment)

For someone whose job is managing RVI's documentation, that's concerning, but not inconsistent with my impressions of RVI... I mean, it's listed under https://github.com/riscv-non-isa like every other non-ISA specification, which itself is the second link in the list at https://github.com/riscv.

...I suspect we are not alone.

It does show up as the first Google result for "riscv assembly syntax" for me. It's not listed on https://wiki.riscv.org/display/HOME/RISC-V+Technical+Specifications (linked from https://riscv.org/technical/specifications/) because it's never had a formal release, but that page also links to https://github.com/riscv-non-isa at the bottom of the Non-ISA table. I'm not sure how else it could be more discoverable beyond having a ratified version to sit alongside specs like the psABI.

Would you call the psABI obscure too, which also used to be in the ISA manual?

Yes, I would.

Ok, but that's never going in the Sail model. The point is, there is a line somewhere between architecture and software.

But it would be even better if it had all the spec's prose under each instruction, not just the Sail.

I'd love to add the spec's prose under each description, and we have done a few stabs at that in my fork. See ThinkOpenly@da29d41 as an example. But, given the reception to just adding instruction names, I'm not sure that would be well-received.

So, this is in fact what we do for CHERI (with some crazy LaTeX macros to slice-and-dice Sail's own LaTeX output), and I'm not opposed to it, in fact I'm a fan of the concept, it works well for us. But it requires buy-in from RISC-V International to do properly, and is yet another huge shift in RISC-V documentation (on the heels of the slow LaTeX -> AsciiDoc conversion). What I'm opposed to is a weird in-between state where the ISA manual is the source of truth for the prose, but the Sail model duplicates some of it as a machine-readable version.

@Timmmm
Copy link
Collaborator

Timmmm commented Aug 21, 2024

I suspect we are not alone.

Yeah I can back you up on that. Took me maybe 6 months of working on RISC-V before I found it and I've introduced colleagues to it. Fairly obscure.

I'm not sure how else it could be more discoverable

I would put it on this page. And the RVA profiles. Honestly I think the RISC-V documentation discovery pages like that one, this one, this one etc. are not very good. They're written under the assumption that you a) read all the prose, b) already know what you're looking for and c) have all the time in the world. That's a rant for another time though. :-)

What about the syntax for referencing symbols? All the various directives?

Yeah I was wondering that too. Is the model going to become a fully featured assembler? Probably not desirable. On the other hand I don't think we should say "don't do assembly at all because we don't want it to become a fully featured assembler" - clearly some support for (dis)assembly is desirable and I don't see it as a slippery slope.

The (obviously under construction) site is https://thinkopenly.github.io/RISC-V_ISA/.

This is cool, I'm totally stealing this effort for my extension, thanks! :-D

But it requires buy-in from RISC-V International to do properly, and is yet another huge shift in RISC-V documentation (on the heels of the slow LaTeX -> AsciiDoc conversion). What I'm opposed to is a weird in-between state where the ISA manual is the source of truth for the prose, but the Sail model duplicates some of it as a machine-readable version.

Yeah maybe, but neither of Paul's PRs actually add any of the prose yet so IMO we can defer that discussion some more. There are also slightly awkward versioning issues to deal with there too & we haven't even got the Sail code into the ISA manual yet.

I wouldn't oppose merging this in future when it's more fleshed out and has a non-WIP use case.

IMO the other PR (adding names) could be merged now because it's strictly better than comments explaining wtf aiupc means and I would say comments like that are helpful.

@ThinkOpenly
Copy link
Contributor Author

I just stumbled upon model/riscv_insts_rmem.sail. Does that not establish some precedent for adding pseudoinstructions to the Sail code? (It's not obvious to me what it is, though, as there are very few actual references to it, and precious few comments.)

/* This file specifies some internal pseudo instructions used by RMEM. */

These pseudoinstruction definitions are in the same namespaces as the base instructions:

union clause ast = THREAD_START : unit
[...]
mapping clause encdec = THREAD_START()
  <-> 0xc0de @ 0b00000000 @ 0b0 @ 0b00 @ 0b010 @ 0b11

function clause execute (THREAD_START()) = RETIRE_SUCCESS

mapping clause assembly = THREAD_START() <-> "thread_start"

(Arguably, they aren't actually pseudoinstructions, because they aren't mnemonics which map to base instruction(s). They are actual instructions which aren't actually part of the model?)

@jrtc27
Copy link
Collaborator

jrtc27 commented Sep 12, 2024

Those are not pseudo instructions in the same meaning of the term. Those are (ab)using free encoding space to define new encodings that can be used by rmem as an implementation detail.

@Timmmm
Copy link
Collaborator

Timmmm commented Sep 13, 2024

It's not obvious to me what it is, though, as there are very few actual references to it, and precious few comments.

Yeah I thought the same. It's something to do with this but it would definitely be worth putting some comments in that file to explain it.

Timmmm pushed a commit to Timmmm/sail-riscv that referenced this pull request Oct 2, 2024
Support limits_icount_to_one in IcountTest
@Timmmm Timmmm removed the tgmm-agenda Tagged for the next Golden Model meeting agenda. label Dec 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants